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Abstract 

I present an approach for modeling areal spatial covariance by considering the 
stationary distribution of a spatio-temporal Markov random walk. This stationary 
distribution corresponds to an intrinsic simultaneous autoregressive (SAR) model for 
spatial correlation, and provides a principled approach to specifying areal spatial 
models when a spatio-temporal generating process can be assumed. I apply the 
approach to a study of spatial genetic variation of trout in a stream network in 
Connecticut, USA, and a study of crime rates in neighborhoods of Columbus, OH, 
USA. 
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1 Introduction 


Almost all spatial data can be viewed as arising from a spatio-temporal generating process. 
For example, a spatial snrvey of infections disease prevalence is a snapshot of a dynamic 
epidemic process occnring in space and time. Similarly, spatial genetic data are the resnlt 
of spatio-temporal dispersal, mating, and snrvival processes at the popnlation level. When 
these spatial processes are observed at mnltiple snccessive time points, the known science 
behind the spatio-temporal process is often nsed to motivate a spatio-temporal statistical 


model (e.g., |Wikle fc Hooten||2010] [Cressie fc Wikle||2011[ ). 

In contrast, consider the case of “spatial” data, where only one temporal realization 
of the spatio-temporal process is observed. In this case, spatial antocorrelation is often 


modeled by inclnding a spatial random effect (e.g., Diggle & Ribeiro 2007) in the htted 
statistical model. The prior distribntion for this spatial random effect is almost always 
modeled semiparametrically nsing a Ganssian process model with covariance fnnction cho¬ 
sen based on the snpport of the data, irrespective of the spatio-temporal generating process. 
For example, when the spatial data are point-referenced, the Matern class of covariance 


fnnctions (e.g., Cressie 1993) are often nsed, while if the spatial data have areal or lattice 


snpport, then either conditional antoregressive (CAR; e.g., Besag 1974, Besag & Kooper- 


berg|1995 Rne fc Held)2005 ) or simnltaneons antoregressive (SAR: e.g., Wall|2004 Cressie 


& Wikle 2011) models are common. In either case, the choice of prior distribntion for the 


spatial random effect is almost always made based solely on the snpport of the data, withont 
consideration of an nnderlying generating process. 

Spatial data are poor in information relative to spatio-temporal data; however, we are 
increasingly able to collect large amonnts of spatial data. The increased information present 
in large spatially-correlated data provides an opportnnity for more realistic modeling of 


2 















































spatial covariance than has been possible in the past. Additionally, recent recognition of 
the potential for spatial confonnding (|Hodges fc Reich 2010, Paciorek[2010 Hnghes & Haran 


2013, Hanks, Schliep, Hooten fc Hoeting|2015) highlights the need to choose a spatial model 


with care, as the strnctnre of a spatially-correlated random effect can inflnence inference 
on hxed effects. 

I propose a general constrnctive approach to modeling spatial correlation based on 
considering the stationary distribntion of a spatio-temporal generating process. This spatio- 
temporal generating process can either be specihed based on scientihc knowledge, or can be 
thonght of simply as a device to constrnct a spatial correlation with desired properties, snch 
as anisotropy and nonstationarity. In Section 2, I describe the proposed general approach, 
and link it to cnrrent spatial models for continnons (geostatistical) random helds. In Section 
3, I focns on areal spatial models and show that the stationary distribntion for a spatio- 
temporal random walk model resnlts in a spatial SAR model, which provides a principled 
approach for choosing areal neighbors and SAR weights when spatial data can be seen 
as arising from a spatio-temporal random walk. In Section 4, I nse this development to 
model spatial genetic data based on a spatio-temporal random walk generating process. I 
apply this model to genetic data collected from front in the Jefferson-Hill Sprnce Brook 
in Connecticnt, USA. In Section 5 I present a second example by modeling crime rates 
for areal neighborhoods in Colnmbns, Ohio, USA. This second example illnstrates how a 
spatio-temporal generating process can be nsed to jointly model hxed and random spatial 
effects. In Section 6 I close with discnssion of the proposed approach. 


3 














2 A Constructive Spatio-Temporal Approach to Mod¬ 
eling Spatial Covariance 

The proposed approach is as follows. 

1. Dehne a deterministic spatio-temporal generating model for the spatio-temporal pro¬ 
cess y{s,t), where s indexes space and t indexes time 

^y(s,t) = J^(y(s,t)). (1) 

For example, could be a differential operator (e.g., in which case (1) is a partial 
differential equation (PDF). 

2. Drive the spatio-temporal process dehned by (1) with time-homogeneous spatial noise 
W(s) 

^y(s,t)=^(y(s,t))+W(s) , W(s)~iV(-,-). (2) 

The process (2) is now a random (stochastic) process in contrast to (1), which is 
deterministic. 

3. The stationary distribution 7r(s) = limt^oo y(-s, t) of (2) provides a spatial model 
capturing the dynamics of the spatio-temporal process. 

^y(s,t)=0 ^ J^(7r(s)) =-W(s) (3) 

Solving (3) for the stationary distribution 7r(s) can be done analytically in some cases, 
but in many others a numerical approximation will be required. 
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2.1 Spatio-Temporal Generating Models for Continuous-Space 


Spatial Models 


I first consider this approach in the context of continnous-space processes, and restrict at¬ 
tention to spatial processes in 7?.^, with the two dimensions s = {xi,X 2 )- The generalization 
to higher (or lower) spatial dimensions is straightforward ( Lindgren et al.||2011 ). The most 
common spatial covariance fnnction used in continuous space is the Matern class, with 
covariance function given by 

1 


cov(sj, Sj) = 




K„ 




r(z/)2^-i 

where dij = {xn — XjiY -\- (xj 2 — 2 :^ 2 )^ is the Euclidean distance between the spatial 
locations of the Tth and j-th observations, is the partial sill parameter, v is the Matern 
smoothness parameter, 0 is a range parameter, and Ky{-) is the modihed Bessel function 


of the second kind (e.g., Cressie 1993). 

As a special case of the constructive spatio-temporal approach proposed in the previous 
section, consider the random partial differential equation 


d 

—y{xi,X2,t) = (A - K^Y^‘^y{xi,X2,t) -h W(a;i, 0 : 2 ), 


(4) 


where A = -|- ^ is the Laplacian and W(a:i, X 2 ) is time-homogeneous spatial Gaussian 

white noise. Note that while equation (4) has been termed a stochastic partial differential 
equation (SPDE; Lindgren et ah 2011), I follow Kloeden & Platen ( |1992 ) and reserve 
SPDE to refer to a differential equation model driven by noise which varies over time (e.g., 
W(xi, X 2 , t) could be a spatial Wiener process), while a random partial differential equation 
(RPDE) is a differential equation driven by time-homogeneous noise, as in (2) and (4). 

The stationary distribution of (4) satishes the RPDE 

(k^ - A)“/^7r(xi,X2) = W(X1,X2), 
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whose solution is a random field of the Matern class (Whittle 1954, Lindgren et ah 2011). 


As a concrete example, consider (4) when = 0 and a = 2 


d 




+ 


dxl 


y{xi,X2,t) + W(a:i,a:2). 


(5) 


The spatio-temporal generating process (5) is a two-dimensional diffusion with time-homogeneous 
spatial sources and sinks defined by W(s). The corresponding stationary spatial distribu¬ 


tion is an intrinsic Matern random field with smoothness parameter u = 2 (See Lindgren 


et ah 2011, for details). 

The novelty introduced in this section is the temporal aspect of the RPDE approach, 
which is not necessary to use spatial models motivated by RPDEs. However, the inter¬ 
pretation of spatial models as stationary distributions of spatio-temporal RPDEs opens 
up the possibility of scientific modeling of spatial random effects when a spatio-temporal 
generating process, such as diffusion (5), can be assumed. Having shown how the Matern 
class of spatial models can be seen as the stationary distributions of the continuous space 
spatio-temporal RPDE (4), I now consider spatio-temporal models with discrete (areal) 
spatial support. 


3 Discrete Space Random Walk Models for Spatial 
Covariance 


Lindgren et ah (2011) consider discrete (areal) spatial models in the context of numerically 
approximating the solution to the RPDE (4) using a finite element basis set. Approximating 
a continuous spatial random field with a finite element approximation leads to increases in 
computational efficiency, as the finite element basis solution results in a Gaussian Markov 
random field (GMRF) with sparse precision matrix. 
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Instead of continuous spatial effects, consider modeling areal spatial processes directly. 
That is, consider modeling a spatial random effect y = [|/i, 1 / 2 ,..., ?/„]' on n spatial locations 
which constitute the full spatial support of the random effect. While both CAR and SAR 
models have been used extensively to model spatial autocorrelation in areal spatial models, 
there is little in the way of guidance on when to use one or the other, and little guidance 
on how to dehne the neighborhood structure that dehnes the CAR and SAR models (e.g.. 


Wall 2004, Assungao & Krainski 2009). 


By considering a Markov random walk on a discrete spatial support, I will hrst derive a 
population-level diffusion RPDE based on a large-population approximation to the spatial 
movements of many individuals. I will then show that the stationary distribution to this 
RPDE is a SAR model. This result will then be used in Section 5 and Section 6 to pro¬ 
pose spatial covariance models based on population-level spatial random walk or diffusion 
processes. 


3.1 Population-level Markov random walks 


Let G = (V,E) be a graph with vertices V = {W, V 2 , ■ ■ ■, Vm} and directed edges E = 
{aij^i = l,2,...,M;j = 1,2,...,M}. In particular, consider the case where aij is the 
exponential rate at which a random walker in node i transitions to node j. As in a standard 
continuous-time Markov chain (CTMC) model for a random walk, the time Ti spent by a 
random walker in node i before transitioning to any other node is exponentially-distributed 
with rate = Yl=i (^ik- 

Consider population-level processes on the graph G in which there are N members of 
the population, all behaving as a random walk. If there are ni{t) individuals at node i and 
time t, then the rate at which individuals move from node i to node j is given by niaij. 


Following Kurtz (1978) and Baxendale & Greenwood (2011), the normalized population 
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Figure 1; Continuous-time Markov random walk model example, aij is the transition rate 
from node i to node j and may be zero, indicating direct migration is impossible without 
traversing other nodes, bi is the rate at which individuals are introduced into the system 
at node i, and di is the rate at which individuals in node i are removed from the system. 

process z(t) = [ziit) Z2(t) ... zm] can be dehned as Zi(t) = ni(t)/N. 

In an open population model, individuals may enter (birth) or leave (death) the system 
continuously in time at any node (Figure 1). It is common to model the birth and death 
rates at node i as being density dependent, with birth rate of riib and death rate of Uid, for 
constant rates b and d shared across space. Instead, I will allow the birth and death rates 
to vary spatially, as this will provide a convenient mechanism for accounting for unmodeled 
spatial variation. To this end, consider birth and death rates that scale with the total 








population size {N). Let Nbi be the rate at which individuals are introduced into node i 
and let Ndi be the rate at which individuals in node i are removed from the system. 

To write a spatio-temporal model for the normalized population process z(t), it will 
be helpful to write each of the potential jumps (movement between nodes, births, and 
deaths) possible in this discrete system. If an individual is introduced at node i, then the 
population at i increases by 1. Notationally, represent this transition in the population 
process n as n —^ n -|- e*, where e* is the canonical vector with M componants, all of which 
are zero except for the Tth element, which is equal to 1. The jump in this birth transition 
is given by e^. Similarly, a death (removal) at node i decreases the population at node i 
by 1 and is given by the jump —e*. Spatial movement (transitions) from node i to node 
j, which occur with rate riiaij, have jumps given by — e^. The possible transitions with 
their rates are given in Table 1. 

Table 1: Transitions and Poisson rates in the continuous-time Markov population process. 


Description 

Transition 

Jump 

Rate 

Birth at node i 

n —!■ n -|- e* 


Nbi 

Death at node i 

n —n — 


Ndi 

Move from node i to node j 

n —n -|- — ej 

Gj 0^ 



Given an initial unnormalized population state n(0) at time zero, the transient distri¬ 


bution n{t) is given by (e.g., Baxendale & Greenwood 2011) 
n(t) = n(0)-l-^(e—ei)Pij j ni{s)aijds j 


NBAs 


- eiPi 


i-T iO 


Ndids 
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where 


Pij{a) ~ Pois{a), i = 0,1,... ,M; j = 0,1,..., M; i ^ j. 


The transient distribution for the normalized density z = n/iV is given by 

z{t) = z(0) + ^(e —f ni{s)aijds +^^6^ [W] - ^Pio [iVdit]") . (6) 

ij^o J ,• V / 


Taking the large population limit as iV —)■ cx) (Kurtz 1978, Baxendale & Greenwood 2011) 


gives the integral equation for the normalized density 

z(t) = z(0) + - ej) / Zi{s)aijds + ^ ei{bi - di)t. 

Jo 


(7) 




Details of this calculation are given in Appendix A. 

The differential equation associated with (7) is 

^ o'ijiej - ei)zi{t) + ei{bi - di) 
i^j i 

which has vectorized form 

= -Q'z(t) + (b - d) (8) 

where b = [ 6 i 62 ■ • • &M]^ d = [di <72 • • • Jm]'■, and Q is the inhnitessimal generator of the 
CTMC or the Laplacian matrix of the graph 


Q 


^ —Otl2 —ttl3 

— Ci2l —0!23 

— 031 —032 *^3fc 


Olm 
^ 2 m 
— O^Sm 


(9) 


\ Oiml (^m2 Cy.m3 ' ' ' ^"mk J 

Equation ( 8 ) specihes a graph diffusion process where b — d is a vector of net inputs 
and outputs to the system and —Q' is a matrix describing proportional rates of transfer 
between spatial locations. 
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3.2 Spatial Models Prom Random Walks 

To specify a spatial model motivated by the differential equation ( 8 ), consider modeling 
the spatial birth and death rates as spatial white noise 

7 = b — d ~ iV(0, (T^I) 

subject to the constraint that 1'7 = 0. This sum-to-zero constraint on 7 is necessary to 
ensure the existence of a stationary distribution tt for ( 8 ). The spatio-temporal differential 
equation ( 8 ) can then be written as the RPDE 

^z(t) =-Q'z(t)+ 7 , 7 ~iV( 0 ,a 2 l). (10) 

The stationary distribution tt for the normalized population process z satishes the balance 
equation that = 0, which implies that 

Q'tt = 7 , 7 ~iV( 0 ,(T^I) 

and thus the stationary distribution for ( 10 ) is given by 

77 ~ N{0, (QQ0~); with I'tt = 0. (11) 

This stationary distribution is a random held on the discrete spatial support of the 
population process z{t) with spatial covariance dehned by the spatio-temporal CTMC 
random walk with inhnitessimal generator Q (9). 

3.2.1 Links to Intrinsic Simultaneous Autoregressive Random Fields 

The random held in (11) corresponds to an intrinsic simulataneous autoregressive (SAR) 
model for spatial correlation. This correspondence provides an intuitive approach for speci¬ 
fying the SAR neighborhood structure in situations where some information is known about 
the spatio-temporal dynamics of the system being modeled. 
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The standard SAR model can be written (see e.g., Section 4.2.7 of Cressie & Wikle 


2011) as 




y ~ A^(0, (I-B)^^A(I-B 

where B has zeroes on the diagonal and A is a diagonal matrix with Ath diagonal Aj. 
Then setting 


a 




Brj = Y- 


and An = 


expresses (6) as an intrinsic SAR model. As in standard SAR models, the matrix Q from 
(4) does not have to be symmetric, bnt rather can incorporate models for asymmetric 
random walks. Additionally, if Q is sparse (many of the {caj} are zero), then sparse matrix 
methods (e.g., Rne fc Held||2005 ) can be employed to sample from and evalnate the density 
in (6). 


The SAR models (and related CAR models) have been viewed as nnintnitive (Wall 


2004). The spatio-temporal random walk motivation for the spatial model in (11) provides 


a principled framework for incorporating knowledge abont the spatio-temporal spread of a 
system into a model for spatial antocorrelation. 

The random held tt in (11) is an intrinsic random held, in that only linear combinations 
are proper ( |Besag fc Kooperberg 1995). An alternative formulation is that the density for 
TT is proper under the constraint that tt sums to zero over the spatial domain. Intrinsic 
random helds are often used as prior distributions, where the posterior distribution is 
proper. For example, consider modeling a Gaussian response y as 


y = fil + TT + e, e~ A^(0, r^I) 

where tt ~ A^(0, (QQ0~)) with Vtt = 0. Under this formulation, tt is constrained to sum 
to zero, but /il -|- tt is not. This formulation can be seen as a form of restricted spatial 
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regression (Hughes & Haran 2013, Hanks, Schliep, Hooten & Hoeting 2015) where the 
spatial random effect tt is constrained to be orthogonal to the intercept pi. 


3.2.2 Identifiability 

The likelihood of (11) 

/(ttIQ) oc exp | — -7r'QQ'7r| 

is a function of QQ^ rather than purely a function of the infinitessimal generator Q. Thus, 
if there are two generator matrices Q and W such that QQ^ = WW', then Q is not 
identifiable. However, the special structure required for a generator matrix of a CTMC 
allows us to prove that Q is identifiable in all but pathological situations. 

Theorem 3.1 //Q and'W are both generator matrices (9) for irreducible M-state CTMCs, 
and at least one row of Q has more than one nonzero off-diagonal entry, then QQ'=WW' 
if and only z/ Q = W. 

The proof is given in Appendix B. The significance of this result is that the only forms 
for Q that are unidentifiable come when the embedded chain of the irreducible CTMC 
governed by Q is deterministic and topologically the graph given by Q is a loop, with flow 
only possible in one direction (either clockwise or counter-clockwise). In all other graph 
topologies, identifiability is guaranteed. 


4 Example 1: Random walk models for spatial genetic 
variation on stream networks 

I now present two examples of spatial analyses using the assumption of a spatio-temporal 
random walk generating process leading to a spatial random effect. The first example 
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comes from landscape ecology, where a common goal is to understand how the landscape 
influences spatial connectivity or correlation. Random walk models are among the most 


common models for gene flow, both in theory and in practice. McRae (2006) showed that 
under a random walk model for migration, a common formulation of genetic dissimilarity 


(the linearized fixation index) was proportional to the circuit resistance distance (Klein & 


Randic 1993) between the nodes in question. Under the formulation of McRae (2006), the 


spatial domain is envisioned as a graph of spatial nodes with symmetric edge weights ay- 
proportional to the (symmetric) rate of random walkers between nodes. The resistance 
distance is the effective resistance in an electric circuit where the nodes are connected by 
resistors with resistance 1/ay equal to the inverse of the migration rate. This approach to 
studying gene flow is known as the isolation by resistance approach, and is often used to 
explore the relationship between landscape features and gene flow. 

While most studies addressing isolation by resistance choose between a finite number 


of pre-specihed edge weights (resistances). Hanks & Hooten (2013) modeled the observed 


pairwise genetic distance matrix using the generalized Wishart distribution of McCullagh 


(2009) with symmetric precision matrix Q (9) and made inference on the edge weights a 




as a function of landscape covariates. Instead of using the RPDE stationary distribution 


approach that gives rise to (11), Hanks & Hooten (2013) considered a variogram argument. 


as follows. Using links between symmetric random walks and electric circuits (Doyle & 


Snell 1984), McRae (2006) showed that under a random walk model for migration, a com¬ 


mon formulation of genetic dissimilarity (the linearized fixation index) was proportional to 
the resistance distance (Klein & Randic 1993). [Hanks fc Hooten ( |2013 ) showed that the 
resistance distance was exactly the variogram (expected squared difference) of an intrinsic 
Gaussian spatial random field with precision matrix Q. While this provides an interesting 
link between random walks and variograms, our goal in this analysis is to directly motivate 
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a spatial model by the stationary distribution of a spatio-temporal model, something not 


explicitly considered by Hanks & Hooten (2013). 

The isolation by resistance approach assumes symmetric edge weights (and thus sym¬ 
metric migration rates), though often it would be more realistic to assume asymmetric 
migration rates reflecting source and sink dynamics. As an example, consider the sys¬ 


tem studied by Kanno et ah (2011), consisting of trout in the Jefferson-Hill Spruce Brook 
in Connecticut, USA. 470 trout were captured at 173 spatial locations along the brook 
(Figure 2) and genotyped, with microsatellite allele data obtained at 15 loci. An isola¬ 
tion by resistance approach would require symmetric migration rates between upstream 
and downstream locations, but a more realistic model (which I will propose) would con¬ 
sider asymmetric migration rates reflecting the potentially increased difficulty in moving 
upstream relative to moving downstream. 


Additionally, Kanno et ah (2011) examine the effect of two seasonal blockages of the 
brook - two locations where the brook dries up and is seasonally impassible to the trout. 
The hypothesized drivers of gene flow and genetic connectivity among the trout population 
on Jefferson Hill Spruce Brook are both directional (differential rates of movement upstream 
and downstream) and non-directional (decreased connectivity between stream locations on 
opposite sides of the seasonal blockages). If spatio-temporal trout movement data were 
available, modeling these directional and non-directional responses to covariates would be 


straightforward (Hooten et ah 2010, Hanks, Hooten & Alldredge 2015). For example, 
movement could be envisioned as occuring on a graph with a node at each spatial location 
where trout were sampled, and edge weights equal to random walk transition rates between 
nodes could be modeled as 

^exp {/do + + (32Vij} if nodes i and j are neighbors 

0 otherwise 


Oiij 


_ / ^^3 


( 12 ) 
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Figure 2: Trout sampling locations on the Jefferson Hill Spruce Brook. 


where {ujj} and {vij} are indicator variables with Uij = 1 if node j is downstream from node 
i and Vij = 1 if a seasonal blockage is located between nodes i and j. In this formulation, 
each node on a branch of the stream network has two neighbors, one upstream and one 
downstream, and edge weights aij are zero for all other non-neighboring nodes. Each node 
at a confluence of two stream branches will have three neighbors, one downstream and two 
upstream. The rate at which a random walker at a node i on a branch of the stream network 
transitions to the nearest upstream node j is aij = l/(Jjjexp{/do} if there is not a seasonal 
blockage between nodes i and j. Similarly, the rate at which the random walker transitions 
from i to the nearest downstream node k is aik = I/dikexp{/3o + (di}. The parameter (32 
models the additive effect that a seasonal blockage has on log-transition rates. Together, 
this simple random walk model allows for transition rates that vary with direction and 
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location based on known spatial stream characteristics. 

A spatial model for the observed microsatellite allele data could then be specihed with a 
latent spatial autocorrelation modeled using the stationary distribution (11) of the random 
walk model (12) when driven by time-homogeneous white Gaussian noise, as described in 
Section 3.2. 

Microsatellite allele data were observed at L = 15 distinct loci for each spatially ref¬ 
erenced trout captured. At the locus, I = 1,2 ,...,L, denote the list of all distinct 


observed alleles from all individuals in the study as {a^i, 0 ^ 2 , ■ • •, Following Guil- 


lot et al. (2005) and others, I model the two observed alleles for each (diploid) individ¬ 


ual as arising from a multinomial distribution with spatially varying allele probabilities 
Vst = {psn Vsi 2 ■ ■ ■ VsiKf)' 1 where s G {1, 2,..., 5} indexes the spatial location. 

Let HsipEk = 1 if the (indexing ploidy) observed allele at the locus is a^k for the 
individual at the spatial location, and Usipik = 0 otherwise. Then the multinomial 


probit model (e.g., Albert & Ghib 1993) for categorical data is often specihed in terms of 


latent variables, z, as follows. Let 

1 5 ^siplk ® f) ■ ■ ■ ) 

0 , otherwise 

where 


Vsipik 


(13) 


^siptk ~ ^{,1-^lk T T]slki !)• 

Then the allele makes up a fraction Psik of the genetic makeup of the subpopulation at 
location s, where 

Psik = Prob {zsiptk = m.a.yi{zsipia, a = 1,..., Ki]) 

The mean of the latent variable Zgip^k in (14) consists of the sum of two effects. The hrst 
is /i£fc, an allele specihc intercept which determines the relative frequency of the allele 
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(a) Upstream Rate Po 
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(b) Downstream Adjustment Pi 
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(c) Seasonal Barrier Effect P 2 
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Figure 3: Posterior histograms of random walk model parameters in the spatial genetic 
analysis of trout in the JeffersonHill Spruce Brook. 

at the locus across the entire population being studied. Large values of relative to 
/i£fc/ make it more likely that Zgipek will be larger than Zgipek', and so the allele will be 
more prevalent than the {k'Y^ allele. Note that the model (13)-(14) is invariant to a shift 
in all /i£fc, as the likelihood is a function of the contrasts Zsiptk — Zsiptk', and not the actual 
values of Zsipek- Thus, if fiek were replaced by Hek + c for /c = 1, 2,..., and some constant 
c, the likelihood of the observed allele data would remain unchanged. To maintain model 
identihability, £x = 0 for ^ = 1, 2 ,..., L, as only the relative differences (contrasts) in 
fiek are identifiable. 

The second term in the mean of (14) is f]sek, which is a spatially varying random effect 
that allows the allele frequencies to vary over the stream network. Following the rea¬ 
soning in Section 3.2, the spatial random effects are modeled as the stationary distribution 
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of a random walk process driven by time-homogeneons noise. Let 


~Af(0,(QQ')-‘), 1V = ') (15) 

where Q is the inhnitessimal generator (9) of the random walk with transition rates (12). 

The model is completed by specifying diffuse Gaussian priors for the random walk 
parameters /3o, /9i, 1^2 and the allele specihc intercepts 

~iV(0,102), j = 0,l,2 (16) 

/i,, ~7 V(o,102), £ = 1 ,2 ,...,L; k = 2,3,..., K^. (17) 

A Markov chain Monte-Carlo (MCMC) algorithm was constructed to sample from the 
posterior distribution of model parameters, given the observed microsatellite allele data. 
Fifteen chains were run with different starting values. Each chain was run for 10® iterations, 
with the hrst 10® samples discarded as burn in. Convergence was assessed by comparing 
posterior histograms obtained from only the hrst half of each chain with posterior his¬ 
tograms obtained from only the second half of each chain. Histograms of the marginal 
posterior distributions of the random walk parameters are given in Figure 3. The posterior 
distribution for /3i is greater than zero, indicating that the data support the anisotropic 
hypothesis that gene how is more rapid downstream than upstream. The posterior distribu¬ 
tion for [32, which captures the ehect of the seasonal blockages, overlaps zero (Figure 3(c)), 
with the 95% equal-tailed credible interval being bounded by (—2.4, 2.2). This indicates 
only weak support (if any) for the hypothesis that the seasonal blockages ahect gene how. 


Vek = 


VUk 

V2ek 

Vn£k 
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Simulated Random Effect 


Simulated Random Effect 



892000 

694000 

696000 

898000 

900000 

892000 

694000 

696000 

898000 

900000 



Easting 





Easting 




Simulated Random Effect Simulated Random Effect 



892000 694000 896000 898000 900000 892000 894000 696000 898000 900000 


Easting Easting 


Figure 4: Four realizations of random fields on the Jefferson-Hill Spruce Brook simulated 
using posterior mean parameter values of the random walk covariance model. 

Posterior distributions for the allele specihc intercepts are not shown, and posterior 
mean values for the intercepts ranged from —2.2 to 0.7. To qualitatively illustrate the ge¬ 
netic correlation structure implied by the estimated parameters, four realizations of random 
helds on the stream network were simulated using the posterior mean parameter values. 
These random helds are shown in Figure 4. The constructive spatio-temporal approach 
proposed here provides a valid autoregressive spatial model for data collected on a stream 
network. In contrast, ? present a moving average (convolution) approach to modeling 
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spatial autocorrelation on stream networks. 


5 Example 2: Crime rates in Columbus, OH. 


A second example illustrates how considering a spatio-temporal generating process can 
provide insights into modeling the interplay between mean and covariance structure in 
spatial models. As mentioned previously, recent recognition of the potential for spatial 
confounding (e.g., Hughes fc HaranpOlS , Hanks, Schliep, Hooten fc Hoetingj2015 ) suggests 
that correctly modeling the relationship between the fixed and random effects in a model 
is important, even if we only desire to interpret the relationship between hxed effects and 
the response. 

Consider the case of 1980 crime rates in 49 neighborhoods in Columbus, Ohio, USA 
( Ansehn|1988 ). Figure 5(a) shows the number of residential burglaries and vehicle thefts per 
thousand housholds in each of the 49 neighborhoods. Figure 5(b) shows the average value 
of homes in each neighborhood, in thousands of dollars. These data are freely available in 


the ‘spdep’ package (Bivand & Piras 2015) of the R statistical computing environment 


Core Team 2015). 


A preliminary linear regression with crime rates as response and average home values as 
predictor variable indicates a negative correlation between average home values and crime 
rates. However, the residuals from this simple linear regression are shown in Figure 5(c) 
and show clear spatial autocorrelation. A standard spatial analysis might consider the 
following spatial linear model 


c = + j3\i + ar) + e, e ~ iV(0, r^I) 


(18) 


17 ~ Af(0, (QQ')“ ) with lr 7 = 0 


(19) 
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where c is a vector of the 1980 crime rates, h is a vector of average home values, 77 is 
a spatial random effect with SAR structure dehned by Q, and e is nonspatial error. A 
symmetric neighborhood graph was dehned with edges between all polygons that share a 
polygon edge, as shown in Figure 5(d). If neighborhoods i and j are neighbors, say that 
i ^ j or, equivalently in this symmetric relationship, j ~ i. The matrix Q in (19) then has 
elements 


Qij 


-1 
< 0 


, j 

, if i 7 ^ J, i'y^ j ■ 


Thus 77 is an intrinsic spatial random effect with precision matrix Q^. Heuristically, 77 is 
a missing covariate that is spatially smooth on the support of the 49 neighborhoods in 
Columbus. 

Now contrast this purely spatial approach with an approach based on considering a 
spatio-temporal graph diffusion generating process. As noted in Section 3.1, the differ¬ 
ential equation ( 8 ) resulting from the large N limit of the population-level random walk 
process is a diffusion process dehned by a vector of inputs to the system and a matrix — Q' 
encoding rates of transfer between spatial nodes in the graph. In this spirit, consider a 
process where the inputs (sources and sinks) are random variables with mean dehned by 
the predictor variable (average home value) and spatial dihusion rates dehned by the spa¬ 
tial neighborhood graph. Note that while this is not a science-based mechanistic model for 
crime in Columbus, it does provide two competing models for how crime rates are related 
to average home values. In the standard spatial model, the spatial random ehect 77 is a 
missing covariate unrelated to average home values h. In the graph dihusion based model 
presented below, a dihusion process spatially smooths the ehect of h, similar to a moving 
average (or convolution-based) spatial model (e.g.. 


Lee et al. 2005 
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As an alternative to the standard spatial mixed effect model in (18)-(19), consider 
modeling crime rates (c) as 

c = /il + TT + e (20) 

where tt is the stationary distribntion of the spatio-temporal graph diffnsion process z{t) 
defined elementwise as 

dz(t) 

—ff— = -i^mziit) + ^ + f3 ■ hi + 6i. (21) 

The first term on the right hand side of (21) defines the flow ont of node i to the n, = ^ 

neighboring nodes. The second term defines the flow into node i from other nodes. The net 
inpnt/ontpnt from “births” and “deaths” into node i is phi + 6i. The intnition here is that 
the spatial sonrce of crime in Colnmbus neighborhoods is correlated with home valnes, and 
that crime spreads spatially ont from neighborhoods with high crime rates to neighboring 
regions, with a constant diffnsion rate of k between all neighboring nodes. 

If 6i are modeled as independent zero mean Ganssian random variables, the RPDE can 
be written in vector form as 

^^ = -KQ'z{t)+(3h + S, S^N{0,a^l) (22) 

and the stationary distribntion tt satisfies 

kQ'tt = / 3 h -|- S, 


or, eqnivalently 


7 r~iV( ^(Q')-'h,-(QQ')-' 

K K 


I'tt = 0 


where (Q')-' is the Bott-Dnffin constrained generalized inverse ( |Bott fc Dnffin||l953[ ) of 
Q'. The data model (20) for the graph diffnsion spatial model can then be written as 

c = fil + ^{Q')~^h +ari + e (23) 
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with 13 = (3/K, a = a/K, and 77 a random effect defined as in (19). Without strong prior 
information, k will be unidentifiable. Instead, consider inference on (3 = (3 /k and a = a/ 
which are identihable. In this formulation, the only difference between the standard spatial 
model in (18) and the graph diffusion based spatial model in (23) is that the hxed effect h 
in (18) is smoothed by (Q')~^ in (23). 

Within a Bayesian framework for inference, I assigned flat Gaussian priors to the regres¬ 
sion parameters p, (3, and (3. Flat half-normal priors were chosen for the spatial random 
effect variance parameters a and a, and an inverse-gamma prior was chosen for the non- 
spatial error variance r^. Inference on the parameters in (19) and (23) was obtained by 
a Markov chain Monte Carlo sampler. In each case, the MCMC sampler was run for 10^ 
iterations. Convergence was assessed by comparing histograms of samples from the hrst 
half of the Markov chain with histograms of samples from the second half of the Markov 
chain. 

Posterior means and 95% credible interval bounds are shown in Table 2. To compare 


models, I computed the Deviance information criterion (DIC, Spiegelhalter et ah 2002). 
Posterior distributions for /i and f3 from the spatial model (18) are similar to those of 
/i and (3 from the graph diffusion model (23); however, the standard deviation a of the 
spatial random effect 77 in the spatial model (18) is larger than the corresponding standard 
deviation a in the graph diffusion model (23). This indicates that the need for the spatial 
random effect is greater in the spatial model than in the graph diffusion model where the 
home value covariate was smoothed by (Q')“^. The DIC of the graph diffusion model 
(DIC=411) was lower than that of the standard spatial model (DIC=442), indicating that 
in this case, considering a spatio-temporal generating process resulted in a better model £t 
than would be obtained by the inclusion of a standard spatial random effect. 
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Table 2: Posterior results for parameters in the spatial and graph-diffusion based models 
for crime in Columbus, OH neighborhoods. The graph diffusion model fits the data better 
as measured by DIG._ 


Parameter 

Post. Mean Post. 0.025 Quantile 

Post. 0.975 Quantile 

Spatial Model (18) DIG = 

442.10 



p 

35.12 


32.09 

38.12 


-9.28 


-12.48 

-6.16 

a 

1.81 


0.31 

3.50 

T 

10.75 


8.86 

13.04 

Graph Diffusion Model (23) 

DIG = 

411.52 



35.13 


31.89 

38.33 


-9.38 


-12.89 

-5.92 

a 

0.94 


0.03 

2.67 

T 

11.51 


9.68 

13.75 


6 Discussion 


While we have focused on discrete space models, this general approach has potential for ap¬ 
plication in continuous space as well. Spatial deformation approaches to nonstationary co- 
variance (e.g., Schmidt fc O’Hagan||2003 Lindgren et al.||2011 ) can be viewed as stationary 
distributions of diffusion processes with spatially heterogeneous diffusion rates. Reaction- 
diffusion models are common in ecology and other fields (e.g.. Keeling et al.||2004 Hu et ah 


2013) and would provide a natural spatio-temporal generating process basis for spatial 


random effect models in a wide variety of systems. Finite element basis and grid-based ap¬ 
proaches to approximating continuous spatial fields have a long history in spatio-temporal 
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(e.g., Wikle & Hooten 2010) and spatial (e.g., Lindgren et al. 2011) analysis, and could 
be used to approximate the stationary distribution of a continuous (infinite-dimensional) 
spatio-temporal generating process with a hnite number of basis functions. 

Current standard approaches to modeling spatial correlation focus on nonparametric 
random effect models. This work proposes a parametric constructive approach to modeling 
spatial random effects based on an assumed spatio-temporal generating process. The two 
examples give some indication of how this approach may be used. In the hrst example, 
existing scientihc knowledge about the system (gene flow on a stream network) was used 
to specify a spatio-temporal generating model (a population-level random walk), and the 
stationary distribution of this spatio-temporal process dehned the distribution of the spatial 
random effect used to model genetic correlation. In the second example, a descriptive 
approach was taken to compare multiple models for spatial variation. In particular, for 
the Columbus crime data, the graph diffusion model provided a better model £t than 
was obtained using a standard spatial random effect model. Modeling spatial random 
effects nonparametrically is the current standard practice; however, there are benehts to 
parametric modeling of spatial random effects when the existing science can suggest a 
spatio-temporal generating mechanism. 


Appendix A: Large population limits of population pro¬ 
cesses 


The interested reader is referred to Kurtz (1981) for a full treatment of stochastic population 


processes. This derivation follows the spirit of Kurtz (1981) and Baxendale & Greenwood 


(2011), but with the novelty of birth and death rates that are not density dependent. 
Following from (6) in Section 3.1, the transient distribution for the normalized density 
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z = n/7V is given by 

z(t) = z( 0 ) + - ei)^Pij [ ni{s)aijds + ^e* (^Poi [Nbit] - ^Pio 

ijf^o Uo i i \ J 

where 

Pij[a) ~ Pois{a), i = 0,1,..., M; j = 0,1,..., M; i 7 ^ j. 

Note that 

PijiS^ n “1“ (yPiji^odj (x) 

= a + Wij{a) , hhij(a) ~ ( 0 , a) 

where each Wij has mean zero on constant variance. Applying this to the transient distri¬ 
bution gives 

If/'* 1 - 

z(t) = z{0) +- Bi) — / ni{s)aijds -f {bjt - djt) 

Arl-LCi L^O J 


ij^O 

- ei)Wij [ ni{s)aijds + ^ e* (hho* [Nht] - Wio [Ndit]) j . 

Uo J , J 

Consider a hxed t > 0 and note that N > ni{s) for all s G (0,t). This gives the result 


1 

N 


that 


ni{s)aijds < Naijt. 


Then to show that all terms above including random variables Wij disappear in the limit 
as A^ —)■ cx), it is enough to consider the behavior of 

^W{Na), fT(a)~(0,a) 


for a constant a > 0. It is trivial to note that 


E 


-W{Na) 


= 0 
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and that 


V ar 


-W{Na) 


1 


Na 


which vanishes in the limit as iV —)■ cx). 

Then, in the large population limit, the transient distribution of the normalized popu¬ 
lation z(t) will be given by 


z(t) = z(0) + ^(e^ 




ed — 


Uo 


ni{s)aijds 


-h ^ Oi {bit - dit). 


Appendix B: Proof of Theorem 3.1 

In this appendix, we prove Theorem 3.1. The proof follows from the fact that QQ' is a 
Gramian matrix (e.g., ?) and thus QQ^ = WW^ if and only if W = QU' for a real unitary 
matrix U'. As W and Q are both generators for CTMC random walks, their rows sum to 
zero (Q1 = W1 = 0 ), with negative diagonal entries {qu < 0 , wu < 0 ) and non-negative 
off-diagonal entries {qij > 0, Wij > 0 for i 7 ^ j). If Q and W are both generators for 
irredicible CTMCs, then both matrices have rank n — 1 and their null spaces are both 
spanned by the 1 vector. As W 1 = 0 , it follows that QUA = 0 and thus U'l = A1 for 
some A. The eigenvalues of any unitary matrix U' have absolute value equal to 1, so A 
either equals 1 or —1. If u' is the Ath row of U', then u'l equals either 1 or —1, but since 
U is unitary, u'u* = 1. These requirements both hold if and only if Uj = Ae^, where is 
the canonical vector with k-th element equal to 1 and all other elements equal to zero. As 
U is of full rank, the rows of U' must contain a full set of canonical vectors spanning TZ^. 

First consider the case where A = 1. Then U' is a permutation matrix, with the columns 
of W being permuted columns of Q. However, as W and Q are generator matrices, each 
diagonal entry of W and Q must be negative, while all off-diagonal entries are non-negative. 
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This can only hold for W if the permutation matrix U' is the identity matrix, and thus 

W = Q. 

Now consider the case where A = —1. Again U' permutes the columns of Q, but now 
the sign of all entries is changed through multiplication by A = —1. So Wa = —qik and 
Wik = —qu for some k. As W is a generator matrix, wa = which is only 

possible if qik is the only non-zero off-diagonal entry in the Ath row of Q. This completes 
the proof. 
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Figure 5: Observed 1980 crime rates (a) and average home values (b) in 49 neighborhoods 
in Columbus, Ohio, USA. The residuals (c) from a simple linear regression of crime rates 
on average home values show clear autocorrelation. A standard spatial analysis might 
include a spatial random effect with SAR neighborhood structure (d) to account for the 
spatial autocorrelation in the data. We contrast this with a graph diffusion based approach 
to jointly modeling spatial autocorrelation and the effect of the spatial covariate (average 
home values). 
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